Fix a bug in extract_best_per_route kernel #156

Merged
rapids-bot[bot] merged 2 commits into NVIDIA:branch-25.08 from rg20:multiple_insert_bug_fix
Jun 27, 2025

Conversation

@rg20
Contributor

@rg20 rg20 commented Jun 27, 2025

Description

The extract_best_per_route kernel does not need any dynamic shared memory, but its launch was passing sh_size, a value sized for the previous kernel in the code. In most cases this is harmless. However, if sh_size exceeds the shared memory available to the kernel, the launch fails with a cudaErrorInvalidValue error. The previous kernel is unaffected because its dynamic shared memory limit is raised for that kernel with a cudaFuncSetAttribute call.

This PR passes zero as the dynamic shared memory size for the extract_best_per_route launch to resolve the issue.
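
For readers unfamiliar with the failure mode, here is a minimal standalone sketch of it. It does not use the project's code: `uses_dynamic_smem`, `needs_no_smem`, `launch`, and `run_demo` are hypothetical stand-ins, and the choice of `cudaFuncAttributeMaxDynamicSharedMemorySize` as the tuned attribute is an assumption, since the description only says that cudaFuncSetAttribute is called for the previous kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the earlier kernel, which really uses dynamic shared memory.
__global__ void uses_dynamic_smem(int* out)
{
  extern __shared__ int scratch[];
  scratch[threadIdx.x] = threadIdx.x;
  __syncthreads();
  if (threadIdx.x == 0) { out[0] = scratch[blockDim.x - 1]; }
}

// Hypothetical stand-in for extract_best_per_route: no dynamic shared memory needed.
__global__ void needs_no_smem(int* out)
{
  if (threadIdx.x == 0) { out[1] = 1; }
}

// Launch a kernel with the given dynamic shared memory size and return the status.
template <typename Kernel>
cudaError_t launch(Kernel kernel, int* d_out, size_t sh_size)
{
  kernel<<<1, 32, sh_size>>>(d_out);
  cudaError_t err = cudaGetLastError();  // reports (and clears) a launch failure
  return err == cudaSuccess ? cudaDeviceSynchronize() : err;
}

// Returns 0 if the tuned launch and the zero-size launch both succeed.
int run_demo()
{
  int dev = 0, max_default = 0, max_optin = 0;
  cudaGetDevice(&dev);
  cudaDeviceGetAttribute(&max_default, cudaDevAttrMaxSharedMemoryPerBlock, dev);
  cudaDeviceGetAttribute(&max_optin, cudaDevAttrMaxSharedMemoryPerBlockOptin, dev);

  int* d_out = nullptr;
  if (cudaMalloc(&d_out, 2 * sizeof(int)) != cudaSuccess) { return 1; }

  // Size the request for the first kernel only, and opt that kernel in
  // (assumed attribute; the PR text does not name it).
  size_t sh_size = static_cast<size_t>(max_optin);
  cudaFuncSetAttribute(uses_dynamic_smem,
                       cudaFuncAttributeMaxDynamicSharedMemorySize,
                       max_optin);

  cudaError_t tuned = launch(uses_dynamic_smem, d_out, sh_size);
  // Reusing sh_size for a kernel that was never opted in: on GPUs where
  // max_optin > max_default, this is the launch the PR reports as failing.
  cudaError_t stale = launch(needs_no_smem, d_out, sh_size);
  // The fix: request zero bytes for the kernel that needs none.
  cudaError_t fixed = launch(needs_no_smem, d_out, 0);

  printf("default limit %d B, opt-in limit %d B\n", max_default, max_optin);
  printf("tuned kernel, sh_size: %s\n", cudaGetErrorName(tuned));
  printf("other kernel, sh_size: %s\n", cudaGetErrorName(stale));
  printf("other kernel, 0 bytes: %s\n", cudaGetErrorName(fixed));

  cudaFree(d_out);
  return (tuned == cudaSuccess && fixed == cudaSuccess) ? 0 : 1;
}
```

Calling `run_demo()` from a `main` function and compiling with nvcc prints the status of each launch. The limit raised by cudaFuncSetAttribute applies per kernel function, which is why a size that is valid for one kernel can be rejected for the next.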

Issue

This is a bug reported by a customer.

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@rg20 rg20 requested a review from a team as a code owner June 27, 2025 14:13
@rg20 rg20 requested review from akifcorduk and chris-maes June 27, 2025 14:13
@copy-pr-bot

copy-pr-bot bot commented Jun 27, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@rg20 rg20 added the bug (Something isn't working) and non-breaking (Introduces a non-breaking change) labels Jun 27, 2025
@rg20 rg20 added this to the 25.08 milestone Jun 27, 2025
@rg20
Contributor Author

rg20 commented Jun 27, 2025

/ok to test e5ff891

@rg20
Contributor Author

rg20 commented Jun 27, 2025

/merge

@rapids-bot rapids-bot bot merged commit 89aef1b into NVIDIA:branch-25.08 Jun 27, 2025
141 of 142 checks passed